Solving Generalized Semi-Markov Decision Processes Using Continuous Phase-Type Distributions
Authors
Abstract
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing MDP techniques. The techniques we present can also be seen as an alternative approach for solving SMDPs, and we demonstrate that the introduction of phases allows us to generate higher quality policies than those obtained by standard SMDP solution techniques.
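The two ingredients named in the abstract, phases and uniformization, can be illustrated with a minimal sketch. It is not the authors' algorithm. It only assumes the textbook constructions: an Erlang distribution written as a chain of exponential phases, and uniformization of a continuous-time generator matrix into a discrete-time transition matrix. The function names `erlang_generator`, `uniformize`, and `discount_factor` are hypothetical and chosen for illustration.

```python
def erlang_generator(n_phases, phase_rate):
    """Generator matrix of an Erlang(n_phases, phase_rate) delay.

    States 0..n_phases-1 are the phases; state n_phases is absorbing and
    represents "the delayed event has fired". (Illustrative construction.)
    """
    size = n_phases + 1
    Q = [[0.0] * size for _ in range(size)]
    for i in range(n_phases):
        Q[i][i] = -phase_rate       # leave phase i at rate phase_rate
        Q[i][i + 1] = phase_rate    # and move on to the next phase
    return Q


def uniformize(Q, rate=None):
    """Return (P, rate) with P = I + Q / rate, a discrete-time transition matrix.

    `rate` must be at least the largest exit rate max_i(-Q[i][i]); it
    defaults to exactly that value.
    """
    n = len(Q)
    max_exit = max(-Q[i][i] for i in range(n))
    if rate is None:
        rate = max_exit
    if rate <= 0 or rate < max_exit:
        raise ValueError("uniformization rate must be positive and >= max exit rate")
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    return P, rate


def discount_factor(rate, alpha):
    """Per-step discount of the uniformized chain for continuous discount rate alpha."""
    return rate / (rate + alpha)


if __name__ == "__main__":
    Q = erlang_generator(2, 3.0)
    P, q = uniformize(Q, rate=6.0)
    for row in P:
        print(row)
    print("gamma =", discount_factor(q, alpha=0.1))
```

Choosing a uniformization rate above the largest exit rate adds self-loops, so every state of the discrete-time chain is left at the same nominal rate. In this sketch that property is what lets a standard discrete-time solver be applied to the result.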
Similar resources
Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decision processes with asynchronous events and actions. Using phase-type distributions and uniformization, we show how an arbitrary GSMDP can be approximated by a discrete-time MDP, which can then be solved using existing M...
Q-MAM: a tool for solving infinite queues using matrix-analytic methods
In this paper we propose a novel MATLAB tool, called Q-MAM, to compute queue length, waiting time and sojourn time distributions of various discrete and continuous time queuing systems with an underlying structured Markov chain/process. The underlying paradigms include M/G/1- and GI/M/1-type, quasi-birth-death and non-skip-free Markov chains (implemented by the SMCSolver tool), as well as Markov ...
Continuity of Generalized Semi-Markov Processes
It is shown that sequences of generalized semi-Markov processes converge in the sense of weak convergence of random functions if associated sequences of defining elements (initial distributions, transition functions and clock time distributions) converge. This continuity or stability is used to obtain information about invariant probability measures. It is shown that there exists an invariant p...
Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms
Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events that can be non-exponentially distributed. Within parametric ACTMCs, the parameters of alarm-event distributions are not given explicitly and can be subject to parameter synthesis. An algorithm solving the ε-optimal parameter synthesis problem for parametric ACTMCs with long-run average optimization objectives is presente...
A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability distributions, such as execution time or battery power. These planning problems can be modeled with continuous state Markov decision processes (MDPs) but existing solution methods are either inefficient or provide no gu...